ABSTRACT



The design and implementation of MUNIX, a tightly-coupled
symmetric multiprocessing PDP-11 based operating system
providing real-time, interactive, and background processing
facilities in a hierarchical memory environment, is
described. MUNIX is a variant of UNIX, an operating system
for the PDP-11 developed at Bell Laboratories.



The three major design goals of the system were: 

(1) support for processes capable of real-time interaction 
with several dynamic graphics display units, an array 
processor, and a multi-channel A/D converter; 

(2) interactive and background processing facilities to
support program development; and,

(3) management of the hierarchical storage created by the
mix of shared and private memories of various speeds.



The resulting MUNIX system provides an effective mechanism 
for resource sharing in a laboratory environment and is the 
basis for protected real-time operation in a multi-user sys- 
tem. 



Keywords: Real-time, operating systems, PDP-11/50,
time-sharing, UNIX

CR Categories: 3.80, 4.32, 6.22






INTRODUCTION



The flavor of a computer operating environment is often
derived from its most stringent processing requirements.
Thus, real-time systems tend to provide very sparse program
development facilities, terminal systems often overlook
adequate background processing mechanisms or real-time
support, and batch systems tend to avoid all interactive
tasks. However, we assert that any system seeking to support
the active development of real-time tasks needs the best
available software engineering tools; the MUNIX operating
system is the result of our efforts to provide this
environment.

Equipment Configuration

The configuration of the Signal Processing and Display
Laboratory is shown in Figure 1. The real-time system can
be viewed as a three bus ensemble, with the respective
functions of data acquisition, signal processing, and
display. When bus cycles are not required by real-time
processes, the data acquisition and display busses support
program development activities. The display system includes
a 256K word fixed head disk, a Ramtek color display, a
Tektronix 4014 display with enhanced graphics, a Vector
General 3D3I system, a Hughes Conographic console, a data
tablet, a Versatec printer/plotter, and an EPC graphic
recorder. Peripherals for the data acquisition controller
include both large (96M words) and small (2.*5M words) disk
systems, magnetic tapes, a card reader, a line printer, and
a sixteen line programmable terminal multiplexer. Dual
ported core memory (88K words) is accessible from either
UNIBUS. The signal processing subsystem consists of a
CSP-125 controller with 4K words of 125 nanosecond memory,
an array processor, and two 16K word banks of three ported
memory.

Figure 1. NPS Signal Processing and Display
Laboratory Equipment Configuration

Operational Factors 

The MUNIX operating environment is a university laboratory
engaged in research and educational use of computer
graphics, signal processing, operating systems, and hybrid
computing. Although several operating systems [5,6,7,10]
are available for use with the PDP-11 computer, each lacks
capability in some dimension which appears important to the
present range of applications. We were thus faced with the
alternative of maintaining several operating systems for use
with the various applications (with the attendant equipment
scheduling and program conversion problems) or developing a
unified operating environment with subsystems which provide
the required specialized support when it is needed. In the
paragraphs which follow, we present the key features of the
multi-function operating system (MUNIX) which provides a
support environment for the development, maintenance, and
operation of real-time programs.

Since program development is a major portion of our
workload, we sought a multiprogramming operating system
which would provide us with interactive terminal support, a
hierarchical file system, and a full complement of program
development software (editor, compilers, assembler, and
utilities). Three other major considerations were the ease
of interfacing new devices to the operating system, the
system's support of extended addressing (memory management),
and the availability of source code. Using these criteria,
the UNIX [10] operating system was chosen as the basis for
program development support.

Once the decision was made to utilize UNIX as the basic
operating environment, attention could be focused on the
technical problems associated with providing our community
of users with an environment which facilitates the
implementation and debugging of real-time processes. The
problem of providing user access to the full set of
peripherals on both processors while seeking to dynamically
balance the UNIBUS and processor loads and provide real-time
support for the display and signal processing tasks led to
the development of the symmetric multiprocessing operating
system discussed in subsequent sections.

Architectural Considerations

A single bus architecture such as the PDP-11 is not a
favorable environment for multiprocessing because each
processor can only communicate with the peripherals and
memory on its own bus. One (expensive) method of solving
this problem is to buy peripherals which are multi-ported
and thus capable of communicating with more than one bus;
another approach uses special purpose bus switches which
toggle one peripheral between the two busses. In light of
the presently available switch technology, this solution was
also rejected as economically infeasible.

Except for the disk storage unit and core memory, MUNIX
swaps processes between the two processors in order to meet
peripheral access requirements. Thus, the access problem
was solved without the benefit of special purpose bus
switches or multi-ported peripherals. Bus traffic is spread
across both busses, and only those processes which must
access devices on both busses incur the shared access
overhead. We believe this solution to be appropriate for use
in real-time systems in which the real-time device access
can be confined to a specific bus.

Another problem which required attention was the
controlled utilization of the several kinds of memory
available in the system. In a system with a small (18-bit)
address range, each page frame is extremely valuable; thus,
it is desirable for special purpose memory to be available
for general allocation whenever it is not required for
real-time operation.



MULTIPROCESSING



MUNIX is a tightly-coupled symmetric multiprocessor
operating system which is designed to provide a mechanism
for bus and processor load balancing as well as a uniform
user interface to a wide variety of system peripherals. In
general, the design is similar to other multiprocessor
systems [1,2,4,9]: the single copy of the system residing in
shared memory uses P and V operators [8] for
synchronization. The processors are completely independent,
each doing its own user process selection from a single
ready process list. In order to facilitate processor
identification, the hardware was modified so that the three
unused bits in the processor status word (bits 8-10) contain
a unique processor identifier.
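
The shared ready list guarded by P and V operations can be
sketched as follows. This is only an illustration under
stated assumptions: the class and method names are
hypothetical, Python's threading.Semaphore stands in for the
P and V operators, and nothing here is taken from the MUNIX
source.

```python
import threading
from collections import deque


class ReadyList:
    """Hypothetical single ready-process list shared by all processors."""

    def __init__(self, processes=()):
        self._queue = deque(processes)
        # A binary semaphore plays the role of the P/V-protected lock.
        self._mutex = threading.Semaphore(1)

    def _P(self):
        """P operation: wait until the list is free, then claim it."""
        self._mutex.acquire()

    def _V(self):
        """V operation: release the list for the other processor."""
        self._mutex.release()

    def add(self, process):
        """Append a process to the ready list under mutual exclusion."""
        self._P()
        try:
            self._queue.append(process)
        finally:
            self._V()

    def select(self):
        """Remove and return the next ready process, or None if empty."""
        self._P()
        try:
            return self._queue.popleft() if self._queue else None
        finally:
            self._V()
```

Each processor would call select() independently; the
semaphore keeps two processors from removing the same entry.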

As Figure 1 indicates, the hardware is not symmetrical
with respect to I/O devices on the two UNIBUSes. The most
important devices, bulk memory and the large disk storage
unit, are dual-ported and can be accessed by processes
running on either processor; all other devices are
single-ported and may only be accessed by their host
processor. Most multiprocessor operating systems avoid this
problem by employing I/O controllers or channels which
communicate with all processors. Since the concept of a
UNIBUS communicating with more than one processor is foreign
to the PDP-11 hardware design, another solution to the I/O
control problem had to be found for MUNIX.

Processor Affinity



Because the distribution of devices across the two
busses in the system is not symmetric, the concept of
processor affinity was introduced into MUNIX. Processor
affinity may be requested by the user or the system, may be
permanent or temporary, and may have advisory or mandatory
status. The use of each of these types of processor
affinity is discussed in the following paragraphs.

It is not feasible to make an a priori determination of the
I/O device requirements of all processes, nor is it desirable
to limit a process to the peripherals attached to only one
bus; therefore, MUNIX supports dynamic process-processor
affinity. It appears desirable to determine device
availability in a truly dynamic fashion, with each processor
initiating all user I/O requests as though all devices were
available on both busses. Only when this access attempt had
failed (as indicated by an addressing error) would the
process be passed to the other processor, where the I/O
would again be initiated. Unfortunately, much of the system
I/O set-up has to be repeated by the second processor, and in
addition, the time for the hardware to discover and report
device nonexistence varies between five and ten microseconds
and stops all bus activity. This scheme would have made
device reconfiguration simple but was abandoned because of
excessive system overhead.

In the current implementation, MUNIX maintains a dynamic
configuration table which lists the devices on each UNIBUS
(the multi-ported devices are in both lists). When a
process requests an I/O operation, MUNIX determines if the
required device can be accessed by the processor which is
currently servicing the process. If not, a temporary
processor affinity flag is set and the user task is
suspended. When next scheduling this task for a processor,
the system uses the temporary affinity flag to ensure that
the correct processor is chosen. When the I/O operation is
completed, the temporary affinity flag is cleared and the
process is once again a candidate for execution by either
processor.
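
A minimal sketch of the configuration-table lookup and the
temporary affinity flag follows. The device names, the
processor numbers 0 and 1, and all function names are
assumptions made for illustration; the paper does not give
the table layout.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical configuration table: device -> processors whose UNIBUS
# can reach it. Multi-ported devices appear under both processors.
CONFIG_TABLE = {
    "large_disk": {0, 1},
    "bulk_memory": {0, 1},
    "display": {0},
    "line_printer": {1},
}


@dataclass
class Process:
    name: str
    temp_affinity: Optional[int] = None  # processor needed for pending I/O


def request_io(proc: Process, device: str, current_cpu: int) -> bool:
    """Return True if the I/O can start on current_cpu.

    Otherwise set the temporary affinity flag so the scheduler hands
    the process to a processor that can reach the device.
    """
    reachable = CONFIG_TABLE[device]
    if current_cpu in reachable:
        return True
    proc.temp_affinity = min(reachable)
    return False


def eligible(proc: Process, cpu: int) -> bool:
    """A process may run on cpu unless a temporary flag names another."""
    return proc.temp_affinity is None or proc.temp_affinity == cpu


def io_complete(proc: Process) -> None:
    """Clear the flag; the process may again run on either processor."""
    proc.temp_affinity = None
```

Because the lookup is a table access rather than a failed
bus reference, no addressing error is needed to discover
that the device lives on the other UNIBUS.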

In order to decrease scheduling overhead, a permanent
processor affinity flag is provided for processes which do
large amounts of I/O to devices which are accessible from
only one bus. This flag is set by the user process via a
system call which specifies the device required by the user;
MUNIX uses the configuration table to translate this request
into the appropriate permanent affinity flag. Once set,
this advisory flag is used by the processor scheduler. As
long as more than one process is ready for execution, a
process with the permanent affinity flag will only be
scheduled on the desired processor. If only one process is
ready, however, it will be executed by either processor
since the alternative is an idle processor. A process with
the permanent affinity flag set will be temporarily switched
in order to access an I/O device on the other UNIBUS. Neutral
processor affinity is the default for non-real-time
processes.
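
The advisory rule can be illustrated with a small selection
function. This is a sketch of one reading of the paragraph
above, not the MUNIX scheduler: the function name and the
list-of-tuples representation are hypothetical, and
priorities are ignored.

```python
def pick_process(ready, cpu):
    """Choose a process for cpu from the ready list.

    ready is a list of (name, permanent_affinity) pairs, where the
    affinity is a processor number or None for neutral affinity.
    The chosen entry is removed from the list and its name returned;
    None means this processor stays idle for now.
    """
    if not ready:
        return None
    if len(ready) == 1:
        # Advisory flag: a lone ready process runs on either processor
        # rather than leaving a processor idle.
        return ready.pop(0)[0]
    for index, (name, affinity) in enumerate(ready):
        if affinity is None or affinity == cpu:
            return ready.pop(index)[0]
    # Several processes are ready and all prefer the other processor.
    return None
```

For example, with ready = [("a", 1), ("b", None)], processor
0 takes "b" first; once "a" is the only ready process,
processor 0 will run it despite its affinity for processor 1.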
Active Process Stack



MUNIX solves the problem of accounting for a separate
stack for each processor in a rather clean, straightforward
manner. With each user process the system associates 1024
bytes of storage. This region contains system per-process
data and the system stack for this process. When attention
is switched from one process to another, it is sufficient to
simply change the system stack segment register to point to
the header area of the new process and reload the stack
pointer; the state of that process is thereby completely
restored. This scheme is sufficiently general to allow
capture of a nested interrupt state as well as the normal
user state. Since this elegant scheme was used in the UNIX
monoprocessing system [10], no change was required to allow
multiprocessing.



MEMORY HIERARCHY MANAGEMENT



As indicated in Figure 2, the system primary memory is
rather unusual. Each processor's memory space consists of
16K words of 450 nanosecond MOS memory which is private, 16K
words of 850 nanosecond core memory which is shared with the
CSP array processor, and 88K words of 800 nanosecond core
which is shared between the PDP-11/50 processors. The
operating system occupies 24K words of the shared memory.

MUNIX provides each user program with a virtual memory
of up to 32K words which is divided by the memory management
hardware into eight pages of 4K words each. Having
thirty-two page frames at its disposal, the memory
management software utilizes a working set algorithm [3] for
page replacement.
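
The page arithmetic above can be checked with a few lines of
code. The sketch assumes 16-bit words and byte addressing, as
on the PDP-11; the constant and function names are
illustrative only.

```python
WORD_BYTES = 2                        # PDP-11 words are 16 bits
PAGE_WORDS = 4 * 1024                 # 4K words per page
PAGE_BYTES = PAGE_WORDS * WORD_BYTES  # 8192 bytes per page
PAGES_PER_PROCESS = 8                 # eight pages of virtual memory
PAGE_FRAMES = 32                      # physical page frames

# Eight 4K-word pages give the 32K-word virtual address space.
VIRTUAL_WORDS = PAGES_PER_PROCESS * PAGE_WORDS

# Thirty-two 4K-word frames fill an 18-bit byte address range.
PHYSICAL_BYTES = PAGE_FRAMES * PAGE_BYTES


def split_virtual_address(vaddr: int):
    """Split a virtual byte address into (page number, byte offset)."""
    if not 0 <= vaddr < PAGES_PER_PROCESS * PAGE_BYTES:
        raise ValueError("address outside the 32K-word virtual space")
    return divmod(vaddr, PAGE_BYTES)
```

With these sizes the top three bits of a 16-bit virtual
address select the page and the low thirteen bits give the
byte offset within it.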

When there are no real-time processes active, the memory
manager uses both private and shared page frames uniformly
except for the restriction that pages from one process
cannot reside simultaneously in both private memories.
Obviously, a process which has one or more pages in a private
memory can only be executed by one processor. If this
process must use the other processor to access a peripheral
device, any pages in private memory are moved to the shared
memory before the temporary affinity flag is set. Since
this move involves significant overhead (the PDP-11 has no
block move instruction), the system attempts to avoid this
situation by assigning the private memory to processes which
have the permanent affinity flag set wherever possible.

Figure 2. Memory Hierarchy



When a process is granted real-time status, the memory
management software removes the pages of all other processes
from the private memory of the desired processor. If page
frames are available and the process status warrants, these
pages are moved to shared memory; otherwise, they are moved
to the swap file. Once the private memory is cleared, all
of the pages of the new real-time process are moved from
wherever they reside (the other private memory, shared
memory, or the swap file) and locked so that they will never
be swapped. This move may involve a transfer to and from
the swap file if the page happened to reside in the wrong
private memory. Once moved, the real-time process is
executed from the fastest memory on the system.
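
The page movements described above can be traced with a
small model. This is a sketch under simplifying assumptions:
the location names, the dictionary representation, and the
function name are hypothetical, and the "process status
warrants" test is reduced to a count of free shared frames.

```python
def grant_real_time(pages, rt_owner, cpu, shared_free):
    """Clear one private memory and load a real-time process into it.

    pages maps (owner, page_number) to a location string: "private0",
    "private1", "shared", or "swap". The mapping is updated in place.
    Returns (moves, locked), where moves is a list of
    (page, source, destination) transfers in the order performed.
    """
    target = f"private{cpu}"
    moves = []

    # Step 1: evict every other process from the target private memory.
    for page, where in sorted(pages.items()):
        if where == target and page[0] != rt_owner:
            if shared_free > 0:
                dest = "shared"
                shared_free -= 1
            else:
                dest = "swap"
            pages[page] = dest
            moves.append((page, where, dest))

    # Step 2: bring in and lock all pages of the real-time process.
    locked = set()
    for page, where in sorted(pages.items()):
        if page[0] != rt_owner:
            continue
        if where != target:
            if where.startswith("private"):
                # Wrong private memory: go out through the swap file.
                moves.append((page, where, "swap"))
                where = "swap"
            pages[page] = target
            moves.append((page, where, target))
        locked.add(page)
    return moves, locked
```

The returned list of moves makes the extra swap-file transfer
for a page found in the wrong private memory explicit.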



PROCESS CONTROL



MUNIX supports three types of processes: foreground
(timeshared), background (batch), and real-time. Since the
control and capability of the first two types of processes
have been reported elsewhere [10], this paper will
concentrate on real-time process support.

One of the primary functions of the Signal Processing
and Display Laboratory is to support signal acquisition and
analysis. To accomplish this task, an analogue signal is
digitized and then sent to the data acquisition process.
This process logs the data and does some front end analysis
before passing it to the CSP-125 with its array processor,
where the signal is processed using Fourier transform
techniques. The transformed data is then passed to a
real-time display process which presents the data on one or
more of the graphic display devices in the system. Both the
data acquisition process and the display process operate
under severe time constraints. In order to support this type
of computing, MUNIX provides a real-time process
classification.

Real-Time Processes 

All processes begin as either foreground or background.
After a process starts execution, it can request, via a
system call, that it become a real-time process. Since a
real-time process is by definition attempting to respond to
some external stimulus (device), these processes must be
executed on the processor whose UNIBUS is connected to the
desired device. Therefore, every real-time process has a
permanent processor affinity which is mandatory rather than
advisory.

When a process requests real-time status, the system
makes two limit checks. First, it determines the number of
real-time processes in existence with an affinity for the
desired processor. If this number is equal to a maximum
value (currently one), the request is denied. Second, it
checks the total amount of memory dedicated to real-time
processes. If this amount plus the amount required for the
requesting process is greater than a maximum value
(currently 64K words), the request is denied. These limit
checks enforce the system policy of not allocating system
resources to real-time processes to the point of severely
degrading system response to other users. Both of these
restrictions are merely administrative as far as the system
is concerned, although the policy of allocating only private
memory to real-time processes obviously must be discarded if
such processes are allowed to occupy more than 64K. The
problem of multiple real-time processes competing for the
same processor is solved by a dynamic system limit on the
number of consecutive quanta which will be allotted to a
real-time process when other processes with sufficiently
high priority are waiting.
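
The two limit checks amount to a short predicate. The limits
are the values quoted above; the function and parameter names
are hypothetical, and the sketch does not model the quantum
limit.

```python
MAX_RT_PER_PROCESSOR = 1          # "currently one"
MAX_RT_MEMORY_WORDS = 64 * 1024   # "currently 64K words"


def may_grant_real_time(rt_count_on_cpu: int,
                        rt_words_in_use: int,
                        words_requested: int) -> bool:
    """Apply the two administrative limit checks to a request."""
    # Check 1: real-time processes already bound to this processor.
    if rt_count_on_cpu >= MAX_RT_PER_PROCESSOR:
        return False
    # Check 2: total memory dedicated to real-time processes,
    # including the requester, must not exceed the maximum.
    if rt_words_in_use + words_requested > MAX_RT_MEMORY_WORDS:
        return False
    return True
```

Note that a request which brings the total to exactly 64K
words passes, since the text denies only totals greater than
the maximum.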

If the real-time request can be granted, the requesting
process is moved into the private memory of the specified
processor (Figure 2), possibly dislocating some non-real-time
processes. After this move, the process is locked into
memory so that it is never a candidate for swapping and is
given the highest possible priority. Thus, whenever a
real-time process becomes ready for execution, it is
preemptively allocated a processor and keeps possession of
this processor until it completes or until an I/O request
causes it to be blocked.

Array Processing and Real-Time I/O

As indicated in Figure 2, the upper 16K words of each
processor's private memory are shared with the CSP-125. In
order to reduce the overhead involved in communication
between a real-time process and the CSP, this memory is made
a portion of the real-time process address space. Thus, a
process which wishes to communicate with the array processor
need only store the data in the upper 16K words of its own
address space and request the operating system to send an
interrupt. Similarly, data coming from the array processor
is placed directly into the address space of a real-time
process, thereby saving the system overhead involved in a
block move; an input data ready interrupt is also provided.

Interactive Graphics 

In addition to accomplishing very efficient communication
with the array processor, the method chosen for supporting
real-time processes solves a problem which would have been
difficult to solve in any other manner. One of the graphic
display units in the system, a Vector General 3D3I, is a
refresh device which retrieves and interprets a display list
forty times a second. The display list contains not only
physical memory addresses to be used in direct memory access
transfers, but also instructions which can cause the display
to store information anywhere within the 32K word segment of
memory which contains the display list. Although MUNIX can
control segment access, there is no effective mechanism for
controlling intra-segment memory references in a user's
display list since the hardware applies neither memory
protection nor relocation to these memory accesses.

Since the display list is quite large and complex, it is
not feasible to have the operating system build a valid list
from parameters supplied by the user, or verify a user's
list before it is sent to the display controller. However,
if the user were allowed to send an arbitrary (unchecked)
display list to the display controller, MUNIX could not
ensure the integrity of any other process residing in the
same 32K memory segment. Our solution to this problem is to
require a real-time classification for all processes which
use this display unit. As noted earlier, real-time
processes are placed in the 32K word private memory segment
and all other processes are removed from this area.
Thereafter, the user process is allowed to specify the
display list, which the operating system sends to the display
controller unchecked. The worst consequence of an invalid
display list in this environment is that it may destroy the
process which built the list.



System Partitioning 



Another interesting by-product of the real-time process
control is a very simple method for dynamic system
partitioning. Since the private memory of each processor is
the low order 32K words of address space, it is very easy to
separate one processor, its private memory, and its I/O
devices. Once separated, this portion of the hardware can be
used to run other operating systems or to test stand-alone
programs. In fact, the stand-alone program or operating
system can be built in the MUNIX environment, loaded into
the private memory as a real-time process, and then be given
complete control of the hardware environment as MUNIX
reverts to a monoprocessing system to continue serving its
other users.



CONCLUSIONS AND FUTURES



Our major efforts to date have been toward implementation
of a logically correct MUNIX system which makes only
those changes to UNIX required by the structural differences
of the present operating environment. The limited
experience we have had to date on MUNIX has served to
confirm our basic design decisions. Supporting all facets of
real-time programming within a uniform environment has
greatly simplified the overall system design.

With each task taking its respective place in a service 
hierarchy, the available system capacity can be dynamically 
allocated to the priority process, thereby avoiding worst 
case a priori resource allocation. This approach has had 
the further advantage of providing a vehicle for rapid (less
than six months elapsed time) development of a sophisticated
system by a small programming group (2 faculty, 6 graduate
students).

Other completed work includes the development of a
dynamic symbolic debugging tool, development of numerous
on-line diagnostic packages and I/O device drivers,
development of a line editor which facilitates correction of
typing mistakes on the interactive level, and enhancements to
the text editor, the text processor, and the linking loader.
Work presently underway includes a performance measurement
subsystem, several adaptive schedulers, a virtual machine
monitor, and a hardened file system.